Why Does Subsequence Time-Series Clustering Produce Sine Waves?

Author

  • Tsuyoshi Idé
Abstract

The data mining and machine learning communities were surprised when Keogh et al. (2003) pointed out that the k-means cluster centers in subsequence time-series clustering become sinusoidal pseudopatterns for almost all kinds of input time-series data. Understanding this mechanism is an important open problem in data mining. Our new theoretical approach (based on spectral clustering and translational symmetry) explains why the cluster centers of k-means naturally tend to form sinusoidal patterns.
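The effect described in the abstract can be probed with a small experiment. The following is a minimal sketch, not the paper's method: it assumes a synthetic random walk as input, a window length of 16, three clusters, mean-subtracted windows, and a plain Lloyd's k-means. The function names (sliding_windows, kmeans, dominant_frequency_share) are hypothetical helpers introduced only for this illustration. For each cluster center it reports how much of the one-sided, non-DC spectral energy sits in a single DFT frequency; a share close to 1 would indicate a nearly sinusoidal center.

```python
import cmath
import random


def sliding_windows(series, w):
    """Return all overlapping length-w subsequences, each with its mean removed."""
    windows = []
    for start in range(len(series) - w + 1):
        win = series[start:start + w]
        mean = sum(win) / w
        windows.append([v - mean for v in win])
    return windows


def kmeans(points, k, iters=15, seed=0):
    """Plain Lloyd's algorithm with squared Euclidean distance."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    dim = len(points[0])
    for _ in range(iters):
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for p in points:
            # Assign the point to its nearest center.
            best = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            counts[best] += 1
            for i, v in enumerate(p):
                sums[best][i] += v
        for c in range(k):
            if counts[c]:  # keep the previous center if a cluster is empty
                centers[c] = [s / counts[c] for s in sums[c]]
    return centers


def dominant_frequency_share(vec):
    """Share of one-sided, non-DC spectral energy held by the strongest DFT frequency."""
    n = len(vec)
    energy = []
    for f in range(1, n // 2 + 1):
        coef = sum(v * cmath.exp(-2j * cmath.pi * f * t / n) for t, v in enumerate(vec))
        energy.append(abs(coef) ** 2)
    total = sum(energy)
    return max(energy) / total if total > 0 else 0.0


# Assumed input: a synthetic random walk (illustrative only, not data from the paper).
rng = random.Random(42)
walk = [0.0]
for _ in range(1199):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))

W, K = 16, 3  # assumed window length and number of clusters
windows = sliding_windows(walk, W)
centers = kmeans(windows, K)
shares = [dominant_frequency_share(c) for c in centers]
for idx, share in enumerate(shares):
    print(f"center {idx}: dominant-frequency share = {share:.2f}")
```

Swapping the random walk for a different series while keeping W and K fixed is one way to check the observation that the centers look similar for very different inputs.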

Similar resources

Theoretical Analysis of Subsequence Time-Series Clustering from a Frequency-Analysis Viewpoint

Although Subsequence Time Series (STS) clustering is one of the most popular techniques for discovering patterns in time-series data, a mathematical methodology for analyzing STS clustering (or pattern discovery from time-series data) has attracted little attention. Against this background, a surprising report [10] found that cluster centers obtained using STS clustering closely resemble "sine waves" w...

Selective Subsequence Time Series clustering

DOI: http://dx.doi.org/10.1016/j.knosys.2012.04.022. Subsequence Time Series (STS) Clustering is a time series mining task used to discover clusters of interesting subsequences in time series data...

Useful Clustering Outcomes from Meaningful Time Series Clustering

Clustering time series data using the popular subsequence (STS) technique has been widely used in the data mining and wider communities. Recently the conclusion was made that it is meaningless, based on the findings that it produces (a) clustering outcomes for distinct time series that are not distinguishable from one another, and (b) cluster centroids that are smoothed. More recent work has si...

A Review of Subsequence Time Series Clustering

Clustering of subsequence time series remains an open issue in time series clustering. Subsequence time series clustering is used in different fields, such as e-commerce, outlier detection, speech recognition, biological systems, DNA recognition, and text mining. One of the useful fields in the domain of subsequence time series clustering is pattern recognition. To improve this field, a sequenc...

Seismic Behavior of 2D Semi-Sine Shaped Hills against Vertically Propagating Incident Waves

This paper presents the preliminary results of an extensive parametric study on seismic response of two-dimensional semi-sine shaped hills to vertically propagating incident P- and SV-waves. Clear perspectives of the induced diffraction and amplification patterns are given by investigation of time-domain and frequency-domain responses. It is shown that site geometry, wave characteristics, and ...

Publication date: 2006